Asymptotically efficient adaptive allocation rules
Similar resources
Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching - Automatic Control, IEEE Transactions on
We consider multiarmed bandit problems with switching cost, define uniformly good allocation rules, and restrict attention to such rules. We present a lower bound on the asymptotic performance of uniformly good allocation rules and construct an allocation scheme that achieves the bound. We discover that despite the inclusion of a switching cost the proposed allocation scheme achieves the same a...
Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays - Part II: Markovian Rewards
At each instant of time we are required to sample a fixed number m ≥ 1 out of N Markov chains whose stationary transition probability matrices belong to a family suitably parameterized by a real number θ. The objective is to maximize the long run expected value of the samples. The learning loss of a sampling scheme corresponding to a parameter configuration C = (θ_1, ..., θ_N) is quantified...
Optimal Adaptive Equal Allocation Rules
Suppose we wish to decide which of two treatments is better, where the outcomes are Bernoulli random variables whose success probabilities are themselves modeled as independent beta random variables. Assume that the maximal population size for the experiment is fixed, but that the length of the study and the number and order of patients assigned to each treatment may be random. Our goal...
Linear Parameter Estimation: Asymptotically Efficient Adaptive Strategies
This paper considers the problem of distributed adaptive linear parameter estimation in multiagent inference networks. Local sensing model information is only partially available at the agents, and interagent communication is assumed to be unpredictable. The paper develops a generic mixed time-scale stochastic procedure consisting of simultaneous distributed learning and estimation, in which th...
Asymptotically Efficient Adaptive Allocation Schemes for Controlled I.I.D. Processes: Finite Parameter Space
We consider a controlled i.i.d. process whose distribution is parametrized by an unknown parameter θ belonging to some known parameter space Θ, and a one-step reward associated with each pair of control and the following state of the process. The objective is to maximize the expected value of the sum of one-step rewards over an infinite horizon. By introducing the loss associated with ...
Journal
Journal title: Advances in Applied Mathematics
Year: 1985
ISSN: 0196-8858
DOI: 10.1016/0196-8858(85)90002-8